Summary.Net
The Next Step in Information Technology

Glossary
Comments and questions: Summary@Summary.Net

1 Page Visit - A visit that only requests a single non-graphic.

Agent - Some piece of software always acts as the "agent" for the user making the request. There is a standard way for that software to tell the web server its name, version number, and possibly other information.

The agent information is not always put into the log file. NCSA Combined logs contain it. WebSTAR logs agent information when the "AGENT" or "CS(USER-AGENT)" tokens are included.

Many web browsers "lie" about their identity. Some web browsers can be directly configured by the end user to send whatever string the user wants. Microsoft Internet Explorer will normally "pretend" to be Netscape Navigator, as in "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)", which starts with "Mozilla", the standard agent tag for Netscape Navigator, but then indicates who it really is with the "compatible; MSIE 4.01;" portion. Some browsers just lie outright, claiming to be Netscape Navigator and giving no hint that this isn't true.
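
The decoding described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Summary's actual rules; the function name real_browser and the two patterns are assumptions made for the example.

```python
import re

def real_browser(agent):
    """Return a best-guess browser name and version from an agent string."""
    # Prefer the "compatible; MSIE x.y" hint that Internet Explorer adds.
    m = re.search(r"compatible;\s*(MSIE)\s+([\d.]+)", agent)
    if m:
        return f"{m.group(1)} {m.group(2)}"
    # Otherwise fall back to the leading product token, e.g. "Mozilla/4.04".
    m = re.match(r"([^/\s]+)/([\d.]+)", agent)
    if m:
        return f"{m.group(1)} {m.group(2)}"
    # Nothing recognizable: report the raw string unchanged.
    return agent

print(real_browser("Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)"))  # MSIE 4.01
```

A browser that lies outright gives no such hint, so a rule like this would still report it as "Mozilla".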

Auth User - The server can be configured to require authorization, the entry of a user name and password, to access a page. The Auth User is simply the string typed into the name field of the authorization dialog. This name is not present for pages which are freely accessible and does not necessarily have anything to do with the actual name of the person making the request. NCSA Common and Combined logs both have the Auth User name. WebSTAR logs this name when the USER token is in the log format.

BPS - Bits per second. This is a rate of data transfer. Common modems are capable of 33.6K or 56K BPS. A T1 line is about 1.5 Meg BPS.
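
As a rough worked example, the ideal time to move a file at a given rate is its size in bits divided by the BPS figure. The sketch below assumes 8 bits per byte and ignores protocol overhead, compression, and line noise, so real transfers will be slower; the function name transfer_seconds is made up for the example.

```python
def transfer_seconds(size_bytes, bps):
    """Ideal transfer time in seconds for size_bytes at bps bits per second."""
    return size_bytes * 8 / bps

# A 50,000 byte page over a 56K modem and over a T1 line (1.5 Meg BPS).
print(round(transfer_seconds(50_000, 56_000), 2))     # 7.14
print(round(transfer_seconds(50_000, 1_500_000), 2))  # 0.27
```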

Browser - The name of the web browser used to make the request. This is derived from the agent string and suffers from some of the same "lying" issues the agent string does. In most cases Summary decodes the standard methods of partially hiding the identity of the browser.

CGI Arguments - A URL can optionally contain a question mark. The portion of the URL to the right of the question mark is often referred to as the search argument or, more generally, as the CGI argument. This portion is traditionally passed to a CGI program for interpretation.
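
A minimal Python sketch of that split follows. The function name split_cgi_arguments and the sample URLs are made up for illustration.

```python
def split_cgi_arguments(url):
    """Split a URL at the first question mark into (path, cgi_arguments)."""
    path, _, args = url.partition("?")
    return path, args

print(split_cgi_arguments("/cgi-bin/search.cgi?q=log+analysis"))
print(split_cgi_arguments("/index.html"))
```

A URL with no question mark simply yields an empty argument string.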

Cookie - A web server can send "cookie" information down to a web browser, which will then supply that information back along with each request. Many end users disable this feature of their web browsers, however. WebSTAR logs cookie information when the "CS(COOKIE)" token is included.

Curr - Current, the "current" time period. Normally used as a prefix as in "Curr Hits" which means hits in the current time period. The length of the current time period is configurable, generally one week.

Destination - The next non-graphic request after the current one in a visit.

Domain - Most computers on the internet are given names which can be used to access them. These names are called domain names. Domain names consist of two or more parts separated by periods, for example "summary.net". You can refer to all of the computers that share some right hand portion of a name as being in the same domain, for example "www.summary.net" and "mail.summary.net" are both in the "summary.net" domain. In Summary the domain name is considered to be the rightmost two or three segments of the name. Summary decides when to use two and when to use three segments in an attempt to match the domain to a company or organization. More segments would typically refer to a single computer, and fewer to a country.
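
The exact rule Summary uses to choose between two and three segments is not given here. The Python sketch below shows one plausible heuristic, under the assumption that three segments are kept when the name ends in a two letter country code preceded by a generic label such as "co" or "com". The label list and the function name domain_of are hypothetical.

```python
# Hypothetical second-level labels that mark a "company under a country" name.
GENERIC_SECOND_LEVEL = {"co", "com", "org", "net", "ac", "gov", "edu"}

def domain_of(host):
    """Reduce a host name to its rightmost two or three segments."""
    parts = host.lower().rstrip(".").split(".")
    if len(parts) <= 2:
        return ".".join(parts)
    # Assumed rule: country code plus generic label means keep three segments.
    country_style = len(parts[-1]) == 2 and parts[-2] in GENERIC_SECOND_LEVEL
    keep = 3 if country_style else 2
    return ".".join(parts[-keep:])

print(domain_of("www.summary.net"))    # summary.net
print(domain_of("www.example.co.uk"))  # example.co.uk
```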

Download - A request for a file that is stored or decoded into a file in the user's file system, as opposed to being displayed on the screen as part of a web page. Summary uses the file name extension to determine if a request is a download. The set of extensions used to make this determination can be configured.

Enter - The first non-graphic requested as part of a visit.

Error - A request which resulted in an error code being sent to the browser. The most common error is 404 - File Not Found. Any result code 400 or higher is treated as an error.
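
In code, the rule is a simple threshold on the result code. A minimal Python sketch, with a made-up list of result codes as input:

```python
def is_error(status):
    """A result code of 400 or higher is treated as an error."""
    return status >= 400

# Illustrative result codes, not taken from a real log.
statuses = [200, 304, 404, 500, 200]
error_count = sum(is_error(s) for s in statuses)
print(error_count)  # 2
```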

Exit - The last non-graphic requested as part of a visit.

File Type - The file name extension is taken to indicate the file type in Summary.

Graphic - A request for a file containing an image. Summary uses the file name extension to determine if a request is for a graphic. The set of extensions used to make this determination can be configured. Requests for graphics are not counted as steps in a visit.

Hit - A single request is often called a "hit" on the web site. Saying there were "56 hits" on an item means that there were 56 separate requests for that item. The item may be a specific file, a particular referrer, or other use of a resource by a single request.

Host - A computer is often referred to as a host when talking about networking. Each computer is assigned a unique IP address. There are some exceptions, where several computers will share a single IP address. In Summary, each unique IP address is referred to as a host.

Local Referrer - A referrer is local to a site if it is in the same domain as the associated request, or in a domain which is equivalent to that domain.
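
One way to sketch this test in Python is to compare the referrer's host name against a set of domains considered equivalent for the site. The function name is_local_referrer and the site_domains parameter are assumptions for the example, not Summary settings.

```python
from urllib.parse import urlsplit

def is_local_referrer(referrer, site_domains):
    """True if the referrer's host is in one of the site's own domains."""
    host = (urlsplit(referrer).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in site_domains)

site = {"summary.net"}
print(is_local_referrer("http://www.summary.net/glossary.html", site))  # True
print(is_local_referrer("http://search.example.com/?q=logs", site))     # False
```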

Method - Each request must contain a method. The most common method is "GET", which means simply get the requested item. A "HEAD" request means to get information about the item, such as its size and the date it was last modified. A browser will often keep copies of items in its cache and then use a "HEAD" method to check if the item has been modified since it was put in the cache.

Others - Any request which is not for a page, graphic, or download.

Page - A request for a web page. Summary uses the file name extension to determine if a request is for a page. The set of extensions used to make this determination can be configured.
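
The extension test used for pages, graphics, downloads, and others can be sketched in Python as below. The extension sets are illustrative guesses, not Summary's defaults, and since the real sets are configurable they are kept as plain constants that are easy to change.

```python
import posixpath

# Assumed extension sets for illustration only.
GRAPHIC_EXTS = {".gif", ".jpg", ".jpeg", ".png"}
PAGE_EXTS = {".html", ".htm"}
DOWNLOAD_EXTS = {".zip", ".exe", ".hqx", ".sit"}

def classify(request):
    """Classify a request as 'graphic', 'page', 'download', or 'other'."""
    path = request.partition("?")[0]            # ignore any CGI arguments
    ext = posixpath.splitext(path)[1].lower()   # e.g. ".gif"
    if ext in GRAPHIC_EXTS:
        return "graphic"
    if ext in PAGE_EXTS:
        return "page"
    if ext in DOWNLOAD_EXTS:
        return "download"
    return "other"

print(classify("/images/logo.GIF"))  # graphic
print(classify("/cgi-bin/search"))   # other
```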

Path - A sequence of requests for non-graphics in a single visit. Summary only keeps the first three requests, the last request, and whether or not there were more than four requests in the path.
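
A small Python sketch of that reduction, assuming the input list already holds only the non-graphic requests of one visit, in order. The function name summarize_path and the dictionary layout are invented for the example.

```python
def summarize_path(requests):
    """Keep the first three requests, the last one, and a 'more than four' flag."""
    return {
        "first": requests[:3],
        "last": requests[-1] if requests else None,
        "more_than_four": len(requests) > 4,
    }

visit = ["/", "/products.html", "/specs.html", "/faq.html", "/price.html", "/order.html"]
print(summarize_path(visit))
```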

Platform - The name of the operating system and/or hardware used to make the request. This is derived from the agent string and suffers some of the same "lying" issues the agent string does. Summary decodes the most common platforms based on internal rules which work with the vast majority of requests.

Referrer - The web browser generally provides the most recent previous URL when making a request, called the referrer. There are two major kinds of referrers. A page that contains graphics will appear as the referrer for the requests for the graphics. When a user clicks on a link that points to your site, the URL of the page containing the link is sent as the referrer.

The referrer information is not always put into the log file. NCSA Combined logs contain it. WebSTAR logs referrer information when the "REFERER" or "CS(REFERER)" tokens are included.

Recent - Requests that occurred in the last several days are considered recent. The exact number of days is configurable. Summary defaults to seven days.

Reload - A request for an item followed by another request for the same item with no other requests in between in the same visit. These can be caused by the user hitting the reload button, by subsequent attempts to complete a failed download, or because requests that would otherwise have occurred in between were satisfied by a cache.
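
Counting reloads then amounts to counting adjacent repeats within one visit. A minimal Python sketch, with a hypothetical function name and sample data:

```python
def count_reloads(requests):
    """Count requests that repeat the immediately preceding request."""
    return sum(1 for prev, cur in zip(requests, requests[1:]) if prev == cur)

# One visit's requests in order; "/a.html" is repeated back to back three times.
print(count_reloads(["/a.html", "/a.html", "/b.html", "/a.html", "/a.html", "/a.html"]))  # 3
```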

Request - When you type a URL into a web browser, it sends a request for the item named by that URL to the server. Request can mean the entire request or specifically the name of the item contained in the request.

Search Phrase - Summary attempts to extract the search string typed by the user into one of the major search engines. The entire string is called the search phrase.
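
Search engines generally carry the typed string in the CGI arguments of the referrer URL, so extraction can be sketched as below. The parameter name "q" and the search engine URL are assumptions for illustration; each real engine uses its own parameter name, and this is not Summary's actual extraction logic.

```python
from urllib.parse import urlsplit, parse_qs

def search_phrase(referrer, param="q"):
    """Return the decoded search string from a referrer URL, or None."""
    values = parse_qs(urlsplit(referrer).query).get(param)
    return values[0] if values else None

print(search_phrase("http://search.example.com/find?q=web+log+analysis"))  # web log analysis
```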

Source - The previous non-graphic request in a visit.

Steps - Each non-graphic request in a visit is counted as one step. The first request is step one, the second is step two, and so on. Steps are normally displayed as the average of many step numbers for the same item from different visits.
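
The averaging can be sketched in Python as follows, assuming each visit is given as an ordered list of its non-graphic requests. The function name average_steps and the sample visits are made up for the example.

```python
from collections import defaultdict

def average_steps(visits):
    """Average the step number at which each item was requested."""
    totals = defaultdict(lambda: [0, 0])  # item -> [sum of steps, count]
    for visit in visits:
        for step, item in enumerate(visit, start=1):
            totals[item][0] += step
            totals[item][1] += 1
    return {item: s / n for item, (s, n) in totals.items()}

visits = [["/", "/products.html", "/order.html"],
          ["/products.html", "/", "/order.html"]]
print(average_steps(visits))
```

Here "/order.html" is always the third step, so its average is 3.0, while the other two items average 1.5.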

Top Level Domain - The last component of a domain name. For example the domain "summary.net" has a top level domain of "net". There are many two letter "country code" top level domains, and only a few longer ones. There is currently a movement to increase the number of longer, non-country, domains.

Unique Host - The number of distinct IP addresses and host names making requests. This may be used as a rough estimate of the number of distinct people, even though it does not exactly correspond to people for two major reasons (and some other minor ones). Some accesses are made through proxy machines that have a single IP address but may be in use by multiple people. Dial-up connections usually have a different IP address each time you dial-up, so a single person accessing the server over the course of several different dial-up sessions will have several IP addresses.

Virtual Server - One server may, in some cases, serve more than one web site. Summary looks at the name of the server, either from the name that the user typed into the request or through the IP address which received the request and calls that name a virtual server. In some cases these will refer to actual (as opposed to virtual) servers.

Visit - A sequence of requests all made from the same IP address, with no gap between requests exceeding a time limit (normally 30 minutes). The time limit is configurable. This normally represents a single person moving through your web site, but there can be exceptions. A proxy machine used by several people could result in several different people accessing the site from the same IP address within the time limit. It is also possible for a single person to make different requests to your site from multiple IP addresses at the same time. Both of these exceptions are rare, generally accounting for a small portion of all visits.
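
A minimal Python sketch of this grouping, assuming each hit is an (ip, time in seconds, request) tuple. The tuple layout and the function name split_into_visits are assumptions for the example; the 30 minute default mirrors the limit mentioned above.

```python
from collections import defaultdict

def split_into_visits(hits, gap_seconds=30 * 60):
    """Group (ip, time, request) hits into visits per IP address."""
    by_ip = defaultdict(list)
    for hit in hits:
        by_ip[hit[0]].append(hit)
    visits = []
    for ip_hits in by_ip.values():
        ip_hits.sort(key=lambda h: h[1])  # order each host's hits by time
        current = [ip_hits[0]]
        for hit in ip_hits[1:]:
            # A gap longer than the limit ends the visit and starts a new one.
            if hit[1] - current[-1][1] > gap_seconds:
                visits.append(current)
                current = [hit]
            else:
                current.append(hit)
        visits.append(current)
    return visits

hits = [("10.0.0.1", 0, "/"), ("10.0.0.1", 600, "/a.html"),
        ("10.0.0.1", 2401, "/b.html"), ("10.0.0.2", 100, "/")]
print(len(split_into_visits(hits)))  # 3
```

Several people behind one proxy would be merged into a single visit by this rule, which is the first exception noted above.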

Web Robot - A program making a request that is not displayed to a person is thought of as a Web Robot. Web robots are used for several purposes, such as search engine indexing, link checking, e-mail address extraction, and update watching. Web robots are identified from the agent string. Summary has an internal database of common known Web Robots. Robots not in that database will not be detected.
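
Detection from the agent string can be sketched as a lookup against a list of known robot names. The three tokens below are only a tiny illustrative list and are not Summary's internal database; the function name is_web_robot is likewise made up.

```python
# Illustrative robot name fragments, matched case-insensitively.
KNOWN_ROBOT_TOKENS = ("scooter", "slurp", "architextspider")

def is_web_robot(agent):
    """True if the agent string contains a known robot name."""
    agent = agent.lower()
    return any(token in agent for token in KNOWN_ROBOT_TOKENS)

print(is_web_robot("Scooter/2.0 G.R.A.B. X2.0"))                        # True
print(is_web_robot("Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)"))  # False
```

As with any list-based approach, a robot whose name is missing from the list, or one that sends a browser's agent string, goes undetected.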


Copyright 1998 by Summary.Net - Updated 6/17/98